IV. SYNTHESIS OF CROSS-SECTIONAL AND LONGITUDINAL MODELS


1. Introduction.

This chapter examines the relationships between the cross-sectional
models developed in Chapter II and the longitudinal models developed in
Chapter III. The longitudinal models allow more general flow processes to be
modelled, and any cross-sectional model is a special case of a longitudinal
model. Although the longitudinal models are more general, they normally have
much greater data requirements and thus are more difficult to implement in
cases where the model coefficients are estimated from historical data.
Therefore we seek some compromise between the basic longitudinal and
cross-sectional models.

The chapter begins with a brief section demonstrating some relationships
between the two models. Sections 3 and 4 present hybrid models that use
cross-sectional data yet have some longitudinal characteristics. Section 3
describes two-characteristic models. These large cross-sectional models have a
special structure which allows for simple calculations and modest data
requirements. Section 4 considers semi-Markov models, which are a
straightforward extension of the cross-sectional model. We find that the
special structure of the semi-Markov model yields some useful approximations.
Finally, section 5 is devoted to a theoretical analysis of the longitudinal
model and the analysis of errors caused by using a best approximating
cross-sectional model.

In this chapter we modify our previous notational conventions. When it
simplifies the exposition we assume that the longitudinal matrices P(u) will
have index u for all u greater than or equal to zero. In previous chapters
we assumed that P(u) = 0 for u > M. This case is still included of course,
but allowing u to range over all positive values often simplifies the limits
on summations in complicated expressions. We also use the probabilistic
interpretations of the cross-sectional and longitudinal models. With the
exception of section 5 all the arguments could be reworded in terms of
fractional flows. However, the use of the probabilistic nomenclature eases the
discussion and simplifies some of the arguments.


2. Relations Between Cross-Sectional and Longitudinal Models.


This section contains an analysis of the relations between cross-sectional
and longitudinal models. It starts with the introduction of an expanded
classification scheme which connects the two models. This leads us to examine
several practical considerations in class expansion. A detailed theoretical
analysis of model comparison is given later in section 5.

In  order  to  use  the  cross-sectional  models  described  in  Chapter  II  one  must 
first  select  a suitable  manpower  classification  scheme.  In  general  one  selects 
the  simplest  scheme  that  will  answer  specific  interesting  questions,  and  stay 
consistent  with  available  data.  It  may  be  helpful  to  expand  the  classification 
scheme  to  develop  a more  realistic  model  of  the  flow  process. 

The cross-sectional data found in most organizations often contains limited
longitudinal information. For example, in a faculty promotion model such as that
described in II.8, the data on individual faculty members probably contains,
in addition to current rank, the length of time in the organization, or length
of time in the current rank. This data often indicates how a simple
classification scheme, such as rank, can be expanded to more realistically model
personnel flows. We exploit this idea below, but first we see how a general
longitudinal model can be rearranged and thought of as a cross-sectional model.

Recall from the general longitudinal model in III.2 that the input flows
on chains 1 through K in period t are given by the K-vector g(t), and the
maximum number of periods spent in the system is M + 1. Suppose that we define
a class to be a combination of chain-type and period of entry. Then we have
K(M + 1) classes. Let the "stocks" at time t be given by the K(M + 1)-vector
of past chain input flows s(t) = [g(t), g(t-1), ..., g(t-M)], and Q be a
K(M + 1) square matrix with zeros except for 1's on the K-th lower diagonal.
If 0 represents a K x K zero matrix, and I a K x K identity matrix,
then for M = 3,

        | 0  0  0  0 |
    Q = | I  0  0  0 |
        | 0  I  0  0 |
        | 0  0  I  0 |

Let f(t) be a K(M + 1)-vector whose first K elements are g(t) and the
remainder all zeros. Then

    s(t + 1) = Qs(t) + f(t + 1)

and we have a cross-sectional formulation. However, the model is simply a
reorganization of the general longitudinal model. We now look at some particular
cases of more interest.
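
The stacked formulation above can be sketched in a few lines of Python. This is only an illustration under assumed sizes (K = 2 chains, M = 3) and made-up inflow numbers; the helper names shift_matrix and advance are hypothetical and do not come from the text. Multiplying by Q moves every block of past inflows down one position, and the newest inflow vector is then written into the first K positions.

```python
def shift_matrix(K, M):
    """K(M+1) x K(M+1) matrix with 1's on the K-th lower diagonal."""
    n = K * (M + 1)
    return [[1 if r - c == K else 0 for c in range(n)] for r in range(n)]


def advance(Q, s, g_new):
    """Shift the stacked inflows down one block and insert the newest inflow."""
    shifted = [sum(q * x for q, x in zip(row, s)) for row in Q]
    f = list(g_new) + [0] * (len(s) - len(g_new))   # newest inflow, then zeros
    return [a + b for a, b in zip(shifted, f)]


K, M = 2, 3                                  # illustrative sizes (assumed)
Q = shift_matrix(K, M)
s_t = [10, 20, 11, 21, 12, 22, 13, 23]       # [g(t), g(t-1), g(t-2), g(t-3)]
s_next = advance(Q, s_t, [9, 19])            # newest inflow vector (assumed)
print(s_next)
```

The oldest block (13, 23) drops out of the vector, which corresponds to individuals reaching the maximum of M + 1 periods in the system.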


Suppose P(0) is a given (N x K) matrix and P(u + 1) = QP(u), where
Q is an N x N matrix. Then, for all u, P(u + 1) = Q^{u+1} P(0), and using
equation (4) in III.2,

    s(t) = P(0)g(t) + Q \sum_{u=1}^{\infty} Q^{u-1} P(0)g(t - u)
                                                                        (1)
         = Qs(t - 1) + P(0)g(t) .

This is a cross-sectional model with f(t) = P(0)g(t).
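
As a numerical check of this equivalence, the sketch below compares the longitudinal sum with the cross-sectional recursion. The matrix Q, the single-chain column P(0) and the inflow series are hypothetical values chosen only for illustration, and inflows before period 0 are assumed to be zero.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


Q = [[0.5, 0.0], [0.3, 0.6]]      # assumed N x N matrix, N = 2
p0 = [1.0, 0.0]                   # assumed P(0) for a single chain (K = 1)
g = [5.0, 7.0, 6.0, 8.0]          # assumed inflows g(0), ..., g(3)


def longitudinal(t):
    """s(t) = sum over u of Q^u P(0) g(t - u)."""
    total = [0.0, 0.0]
    for u in range(t + 1):
        col = p0
        for _ in range(u):
            col = matvec(Q, col)              # builds Q^u P(0)
        total = [a + g[t - u] * c for a, c in zip(total, col)]
    return total


def cross_sectional(t):
    """s(t) = Q s(t - 1) + P(0) g(t), started from an empty system."""
    s = [0.0, 0.0]
    for tau in range(t + 1):
        s = [a + g[tau] * b for a, b in zip(matvec(Q, s), p0)]
    return s


for t in range(len(g)):
    print(t, longitudinal(t), cross_sectional(t))
```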


A converse to this result is also true. Suppose s(t) - P(0)g(t) = Qs(t - 1)
for any values of g(t - u), u \ge 1. Then we must have P(u + 1) = Q^{u+1} P(0).
To see this set g(t - u) = 0, except when u = k. Then s(t - k) = P(0)g(t - k)
and s(t) = P(k)g(t - k) = Q^k P(0)g(t - k). Since g(t - k) is arbitrary,
we must have P(k) = Q^k P(0). Thus we have shown the longitudinal and
cross-sectional models are identical if and only if f(t) = P(0)g(t) and
P(u + 1) = Q^{u+1} P(0) for all u \ge 0.


Problem 1: If P(u + 1) = Q^{u+1} P(0), and the maximum number of periods in the
system is M + 1, what limitations does this place on the structure of Q?


Returning to the expansion of the classification scheme, suppose that
we have a longitudinal model with N classes, and maximum time in system equal
to (M + 1) periods. A class is now redefined to be a combination of an original
class i and a length of completed service u. Thus there are N x (M + 1)
new classes, and the stocks in these classes are given by the vector
[s_i(t;u)], for i = 1,2,...,N, and u = 0,1,2,...,M. Consider first the special
case where the number of original classes N is equal to the number of chains K.
Thus the matrices P(u) in the longitudinal model are each square.

Define q_{ji}(u) as the fraction of those in original class i with u
periods of completed service, who move to original class j in one period. Then
for each k = 1,2,...,K, the k-th column of P(u + 1) is Q(u) times the k-th
column of P(u), so that

    P(u + 1) = Q(u)P(u) .

If P(u) has an inverse, then

    Q(u) = P(u + 1)P^{-1}(u)   for u = 0,1,...,M - 1 .

In this case, the cross-sectional model is

    s(t + 1;0) = g(t + 1) ,

    s(t + 1; u + 1) = Q(u)s(t;u) ,   u = 0,1,...,M - 1 .

Example 1:

In the one class one chain model (K = N = 1) we have q(u) = p(u + 1)/p(u).
If p(0) = 1, and p(u) is nonincreasing, then 0 \le q(u) \le 1. The numbers
q(u) are commonly called continuation rates, since q(u) gives the fraction
of people who continue in the system for at least (u + 1) periods, given that
they have been in the system u periods.
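
A minimal sketch of the continuation-rate calculation follows. The survivor fractions p(u) are hypothetical, and exact fractions are used so that the round trip from p(u) to q(u) and back is exact; the function names are illustrative only.

```python
from fractions import Fraction


def continuation_rates(p):
    """q(u) = p(u + 1)/p(u) for a one-class, one-chain survivor curve p."""
    return [p[u + 1] / p[u] for u in range(len(p) - 1)]


def survivor_curve(q):
    """Rebuild p(u) from the continuation rates, starting from p(0) = 1."""
    p = [Fraction(1)]
    for rate in q:
        p.append(p[-1] * rate)
    return p


# Hypothetical data: fraction still in the system after u periods.
p = [Fraction(1), Fraction(4, 5), Fraction(3, 5), Fraction(3, 10)]
q = continuation_rates(p)
print([str(rate) for rate in q])
```

Because p(u) is nonincreasing, every rate lies between 0 and 1, and the product of the first u rates recovers p(u).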

More generally, when N \ne K, we can choose Q(u) so that Q(u)P(u)
approximates P(u + 1). This can be accomplished if, for each j = 1,2,...,N,
we solve the quadratic minimization problem:

    Minimize \sum_{k=1}^{K} v_{jk}^2

where

    v_{jk} = p_{jk}(u + 1) - \sum_{i=1}^{N} q_{ji}(u) p_{ik}(u) .

The matrix Q(u) which solves this problem is given by

    Q(u) = P(u + 1)P^{+}(u) ,

where P^{+}(u) is the generalized inverse of P(u). However, there is no
guarantee Q(u) will be nonnegative with column sums less than one.
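
The least-squares choice of Q(u) can be illustrated with a small sketch. It assumes N = 2 classes and K = 3 chains, hypothetical matrices P(u) and P(u + 1), and a P(u) of full row rank, in which case the generalized inverse is P'(PP')^{-1}. The check at the end verifies the normal equations: the residual P(u + 1) - Q(u)P(u) is orthogonal to the rows of P(u).

```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inv2(A):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical longitudinal matrices, N = 2 classes by K = 3 chains.
P_u = [[1.0, 0.6, 0.5], [0.0, 0.3, 0.2]]
P_next = [[0.7, 0.4, 0.3], [0.2, 0.4, 0.3]]

Pt = transpose(P_u)
P_pinv = matmul(Pt, inv2(matmul(P_u, Pt)))   # generalized inverse (full row rank)
Q_u = matmul(P_next, P_pinv)                 # least-squares Q(u)

fitted = matmul(Q_u, P_u)
residual = [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(P_next, fitted)]
normal_eq = matmul(residual, Pt)             # should vanish at the minimum
print(Q_u)
```

Nothing in this construction forces the entries of Q_u to be nonnegative, so the result should be inspected before it is used as a transition matrix.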

We close this section with a practical discussion of how a model with
longitudinal features can be modified to seem more like a cross-sectional model.
It seems best to establish this point by example.

Example 2: Consider the three class cross-sectional faculty model in example
1 of II.3. Given an individual enters class 1, the individual can move eventually
to class 0 or 2. The expected duration in class 1 is 1/(1 - q_{11}). If we ask
for the expected duration conditioned on moving to class 0 (is not given tenure)
the answer is still 1/(1 - q_{11}). The same answer will be obtained if we ask
for the expected lifetime in class 1 given eventual promotion to class 2 (is
given tenure). The Markov model treats a visit to class 1 as a two-stage
process, as is illustrated in Figure IV.1.

[Figure: Class 1 leads to a "Reclassify" node, which leads to Class 0 or to
Class 2.]

Figure IV.1. Illustration of Markov Model in Example 2.

At the first node, the individual either stays in class 1 or not and the
expected number of periods at class 1 is independent of the reclassification
process.
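
The independence claimed here can be checked numerically. In the sketch below the one-period probabilities for class 1 (stay, leave, promote) are hypothetical values; the conditional mean duration is computed from the geometric visit-length distribution, truncated at a large number of periods.

```python
# Assumed one-period probabilities for class 1 (they sum to 1).
q11, q01, q21 = 0.7, 0.1, 0.2


def conditional_mean(q_exit, n_max=2000):
    """Mean visit length given the exit class, from P(length = n, exit)."""
    joint = [q11 ** (n - 1) * q_exit for n in range(1, n_max)]
    weighted = sum(n * p for n, p in zip(range(1, n_max), joint))
    return weighted / sum(joint)


mean_if_leaving = conditional_mean(q01)      # eventual move to class 0
mean_if_promoted = conditional_mean(q21)     # eventual move to class 2
print(mean_if_leaving, mean_if_promoted, 1 / (1 - q11))
```

The exit probability cancels in the ratio, which is why both conditional means agree with the unconditional value.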

Suppose we know that the lifetimes of individuals in class 1 are dependent
on their eventual status. Let T_0 be the expected lifetime in class 1 given an
eventual move to class 0, and T_2 be the expected lifetime in class 1 given an
eventual move to class 2. We can construct a four class cross-sectional model
that has these characteristics:


    New Class                             Old Class

    1. Nontenure who leave                1. Nontenure
    2. Nontenure who move to tenure       1. Nontenure
    3. Tenure                             2. Tenure
    4. Retired                            3. Retired

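The new system will be distinguished by a tilde:

    \tilde{s}(t) = \tilde{Q}\tilde{s}(t - 1) + \tilde{f}(t) .

[The definitions of the expanded inflows \tilde{f}(t) and of the 4 x 4 matrix
\tilde{Q} are illegible in the source. The legible fragments show the inflow
f_1(t) divided by the sum q_{01} + q_{21}, and matrix entries involving
T_0, (T_2 - 1)/T_2, q_{22} and q_{23}.]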


This expanded model makes the distinction we want in the time spent in
nontenure, and it also tells us the fraction of professors in nontenure that
eventually acquire tenure, namely
\tilde{s}_2(t)/(\tilde{s}_1(t) + \tilde{s}_2(t)).


3.  Two-Characteristic  Cross-Sectional  Models. 


This section examines cross-sectional models with two dimensional state spaces
using the probabilistic interpretation presented in III.9. Assumptions on
permissible flows between states lead to a special structure, and this in turn
allows simple calculation of quantities such as projected inventories and
lifetime in each classification.

The key to the special structure is the organization of the classification
scheme. The classes (or states) are defined in terms of two characteristics,
(i,j), where the first characteristic (henceforth FC), i, runs over the indices
1 through N. The range of the second characteristic (henceforth SC), j, depends
on the FC. Let S be the set of all possible classes, and
S(i) = {j | (i,j) \in S} be the set of possible SC's given that the FC is i.
Let |S(i)| be the number of elements in the set S(i).

At time t an individual's class can be described by a random variable X(t).
The cross-sectional assumption assures us that knowledge of X(t) is sufficient
for prediction of X(t + 1), X(t + 2), etc., without knowledge of X(t - 1),
X(t - 2), etc. To obtain the special structure of the two-characteristic model
we impose limitations on the allowable transitions between classes. If the
current FC is i, the only allowable moves in one period are

      (i)  to classes with FC still equal to i,
or   (ii)  to classes with FC equal to i + 1.

Example 3: Let the FC represent length of time in system and SC the grade of
an individual. Consider the four grade student example with grades j = 1,2,3,4,
for freshman, sophomore, junior and senior respectively. Clearly in each time
period the first characteristic increases by 1. Let the maximum time in the
system be 5 years (1 year = 1 time period), and let the sets of classes be

    [Table of i and S(i), partly illegible in the source. Legible entries:
    {1,2}, {2,3, (truncated), {3,4}, {4,5}.]

This is an example of the 'LOS/GRADE' model. Note that N = 5, and |S| = 9.

Problem 2: List all the chains which would be present if example 3 were
re-formulated as a longitudinal model.


Since the two-characteristic model is of the cross-sectional type it must
be defined by a transition matrix Q, where Q is square with each dimension
equal to |S|. We consider the two types of allowable flow separately.

(i) No change in FC i.

Define for each j and m in S(i),

    q_{mj}(i) = P[X(t + 1) = (i,m) | X(t) = (i,j)] ,

and let Q(i) be the |S(i)| by |S(i)| matrix with (m,j)-th element equal to
q_{mj}(i).

(ii) Change from FC i to FC (i + 1).

Define for each m in S(i + 1) and each j in S(i),

    p_{mj}(i) = P[X(t + 1) = (i + 1,m) | X(t) = (i,j)] ,

and let P(i) be the |S(i + 1)| x |S(i)| matrix with (m,j)-th element equal
to p_{mj}(i).

The Q matrix is given by (for N = 4)


             | Q(1)   0      0      0    |
(2)      Q = | P(1)   Q(2)   0      0    |
             | 0      P(2)   Q(3)   0    |
             | 0      0      P(3)   Q(4) |

where the 0's are matrices with all elements equal to zero.

Example 4: Continuation of example 3.

Since the LOS must increase by 1 each year all the Q(i) matrices are zero
matrices. Thus Q has the structure

    [9 x 9 partitioned matrix, not recoverable from the source: the diagonal
    blocks Q(i) are zero, and the entries marked x appear only in the blocks
    P(i) below the diagonal.]

where x indicates a (possibly) non-zero element. The partitioning is included
to help the reader identify the P(i) matrices.

Example 5: Re-formulation of example 3.

Suppose that the FC represents the grade of an individual in a system where
no demotions can occur and in which a person cannot advance more than one grade
per year. Let SC represent the time spent in the particular grade. This is
called the 'GRADE/TIME-IN-GRADE' model. Let the grades be 1) freshman,
2) sophomore, 3) junior and 4) senior, and let the maximum time in each grade
be 2 years. We have

    i     S(i)

    1     {1,2}
    2     {1,2}
    3     {1,2}
    4     {1,2}

Note that N = 4 and |S| = 8. Now the Q matrix has the structure

    0  0 | 0  0 | 0  0 | 0  0
    x  0 | 0  0 | 0  0 | 0  0
    -----+------+------+-----
    x  x | 0  0 | 0  0 | 0  0
    0  0 | x  0 | 0  0 | 0  0
    -----+------+------+-----
    0  0 | x  x | 0  0 | 0  0
    0  0 | 0  0 | x  0 | 0  0
    -----+------+------+-----
    0  0 | 0  0 | x  x | 0  0
    0  0 | 0  0 | 0  0 | x  0

where again x indicates a (possibly) non-zero element.

Example 6: Re-formulation of example 3.

Suppose that the FC represents the grade of an individual (as in example 5)
in a system with no demotions and no double or multiple promotions per period.
Let the SC represent the time in the system, or length of service (LOS). This
is called the 'GRADE/LOS' model. Let the grades be 1) freshman, 2) sophomore,
3) junior and 4) senior, and let the maximum time in the system be 5 years, with

    i     S(i)

    1     {1,2}
    2     {2,3}
    3     {3,4}
    4     {4,5}

Now the Q matrix has the structure

    0  0 | 0  0 | 0  0 | 0  0
    x  0 | 0  0 | 0  0 | 0  0
    -----+------+------+-----
    x  0 | 0  0 | 0  0 | 0  0
    0  x | x  0 | 0  0 | 0  0
    -----+------+------+-----
    0  0 | x  0 | 0  0 | 0  0
    0  0 | 0  x | x  0 | 0  0
    -----+------+------+-----
    0  0 | 0  0 | x  0 | 0  0
    0  0 | 0  0 | 0  x | x  0

All the above examples display the special structure of Q which is depicted
in (2). Recall from Chapter II that many applications of the cross-sectional
model require calculation of the inverse (I - Q)^{-1}, which we called D.
Although the Q matrix in the two-characteristic model is often quite large, it
is easy to calculate D in terms of the inverses of the smaller submatrices.
Define D(i) = (I - Q(i))^{-1} for each FC i. Then (for the case N = 4),

        | D(1)                           0                      0              0    |
    D = | D(2)P(1)D(1)                   D(2)                   0              0    |
        | D(3)P(2)D(2)P(1)D(1)           D(3)P(2)D(2)           D(3)           0    |
        | D(4)P(3)D(3)P(2)D(2)P(1)D(1)   D(4)P(3)D(3)P(2)D(2)   D(4)P(3)D(3)   D(4) |

Thus D is completely determined by the matrices D(i), i = 1,...,N, and P(i),
i = 1,2,...,N - 1.
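
The block formula for D can be checked on a small case. The sketch below uses N = 3 first-characteristic values with two second-characteristic values each, and hypothetical entries arranged as in the GRADE/TIME-IN-GRADE pattern. The full inverse is computed by the series I + Q + Q^2 + ..., which terminates only because this particular Q is strictly lower triangular; that shortcut is an assumption of the example, not a general method.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical blocks Q(i) and P(i); each block is 2 x 2.
Qb = [[[0.0, 0.0], [0.3, 0.0]], [[0.0, 0.0], [0.2, 0.0]], [[0.0, 0.0], [0.1, 0.0]]]
Pb = [[[0.5, 0.8], [0.0, 0.0]], [[0.6, 0.9], [0.0, 0.0]]]
N, m = 3, 2
n = N * m

# Assemble the full Q with the block structure of (2).
Q = [[0.0] * n for _ in range(n)]
for i in range(N):
    for r in range(m):
        for c in range(m):
            Q[i * m + r][i * m + c] = Qb[i][r][c]
            if i + 1 < N:
                Q[(i + 1) * m + r][i * m + c] = Pb[i][r][c]

# Full inverse by the terminating series (Q^n = 0 for this Q).
I_n = [[float(r == c) for c in range(n)] for r in range(n)]
D_full, term = I_n, I_n
for _ in range(n):
    term = matmul(term, Q)
    D_full = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(D_full, term)]

# Block formula: D(i) = (I - Q(i))^{-1}; block (r, c) is D(r)P(r-1)...P(c)D(c).
Db = [inv2([[1 - B[0][0], -B[0][1]], [-B[1][0], 1 - B[1][1]]]) for B in Qb]


def block(r, c):
    out = Db[c]
    for i in range(c, r):
        out = matmul(Db[i + 1], matmul(Pb[i], out))
    return out


print(block(2, 0))
```

Only the small matrices D(i) are ever inverted in the block formula, which is the computational saving described in the text.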


Computations in forecasting are considerably reduced by taking advantage of the
special structure. Let s_i(t) be the vector of stocks at time t with FC i.
Thus s_i(t) is a |S(i)| vector. Then the stocks at (t + 1) are given by

    s_i(t + 1) = Q(i)s_i(t) + P(i - 1)s_{i-1}(t) + f_i(t + 1),   i = 2,...,N ,

where f_i(t) is the vector of input flows in period t with FC i. For i = 1
the term in P(i - 1) is absent. The total stocks at (t + 1) with FC i is found
by summing the elements of s_i(t + 1).


Problem 3: Let b_{mj}(i) be the probability that, given the current state is
(i,j), the state entered on leaving S(i) is (i + 1,m). Let B(i) = [b_{mj}(i)].
Show that B(i) = P(i)D(i).

Problem 4: Let b_{mj}(k;i) be the probability that, given the current state is
(i,j), the state entered when S(k) is entered is (k,m). Let
B(k;i) = [b_{mj}(k;i)], a |S(k)| by |S(i)| matrix. Show that
B(i) = B(i + 1;i), and for k > i + 1,

    B(k;i) = B(k - 1)B(k - 2) ... B(i).

4. Semi-Markov Flow Models.

A simple longitudinal model that retains some of a cross-sectional model's
useful properties is the semi-Markov model. This section presents the general
ideas behind such a model and indicates how some useful quantities can be
calculated or approximated without completely specifying the flow process. We
use terminology from probability theory to present the model, but the reader
should recall that it is not necessary to view the model in a probabilistic
sense. Although it can be viewed as a deterministic flow process we find the
exposition easier and smoother using Markov chain terminology.

Consider a system with N classes of manpower. When an individual enters
class i we say he commences a visit to class i. Let q_{ji}(u) be the
probability that a visit to class i lasts u periods and finishes with transition
to state j. As in earlier chapters class 0 is interpreted as outside the system,
and since a visit to any class is assumed to be at least 1 period in length,
q_{ji}(0) = 0.

The probabilities q_{ji}(u), i = 1,2,...,N, j = 0,1,2,...,N, u = 1,2,...,
form the basic data of the model, and from these the following interesting
quantities can be calculated:

(i) the probability that class j will follow class i,

    q_{ji} = \sum_{u=1}^{\infty} q_{ji}(u) ,

(ii) the expected length of a visit to class i, given j is the next
class visited,

    \mu_{ji} = \sum_{u=1}^{\infty} u q_{ji}(u) / q_{ji} ,

(iii) the expected length of a visit to class i,

    \mu_i = \sum_{j=0}^{N} q_{ji} \mu_{ji} ,

(iv) the probability of spending more than u periods in class i,

    h_i(u) = \sum_{v=u+1}^{\infty} \sum_{j=0}^{N} q_{ji}(v) ,

(v) the variance in the length of a visit to class i, given that the
next class visited is j,

    \sigma_{ji}^2 = \sum_{u=1}^{\infty} (u - \mu_{ji})^2 q_{ji}(u) / q_{ji} ,

(vi) the variance in the length of a visit to class i,

    \sigma_i^2 = \sum_{u=1}^{\infty} \sum_{j=0}^{N} (u - \mu_i)^2 q_{ji}(u) .

Problem 5: Show that

    \mu_i = \sum_{u=0}^{\infty} h_i(u) ,

and

    \sigma_i^2 + \mu_i^2 - \mu_i = 2 \sum_{u=0}^{\infty} u h_i(u) .

Example 7: Consider a student enrollment model with the following 5 states:

1. Freshmen
2. Sophomores
3. Juniors
4. Seniors
5. Degree winners (graduates).

Assume that the only transitions possible are from i to either (i + 1) or 0,
and that no state can be held for more than three periods. The basic data are
given by (blanks indicate zeros):

    u            1       2       3

    q_{01}(u)    0.15    0.10
    q_{21}(u)    0.65    0.10
    q_{02}(u)    0.10    0.05    0.01
    q_{32}(u)    0.70    0.10    0.04
    q_{03}(u)    0.15    0.05
    q_{43}(u)    0.75    0.05
    q_{04}(u)    0.05
    q_{54}(u)    0.90    0.05
    q_{05}(u)    1.00

By using (i) it is easy to calculate the 6 x 5 matrix of probabilities
[q_{ji}]. These are:

    j \ i    1       2       3       4       5

    0        0.25    0.16    0.20    0.05    1.00
    2        0.75
    3                0.84
    4                        0.80
    5                                0.95

Notice that the elements in each column sum to 1.00.

[The table of conditional means \mu_{ji}, obtained from (ii), is missing from
the source.]

From this table we see that, given a student will become a junior, the
expected time he spends as a sophomore is 1.21 periods. Given he is to leave
after being a sophomore, the expected time spent as a sophomore is 1.44 periods.

By using (v) the variances [\sigma_{ji}^2] are

[The table of variances is missing from the source.]
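
The quantities (i) through (vi) can be computed directly from the basic data. The sketch below uses the Example 7 probabilities as read from the table; the assignment of the first two rows to q_01(u) and q_21(u) is an assumption, made because it reproduces the column totals 0.25 and 0.75. The function names are illustrative only.

```python
# q[(j, i)] lists q_ji(u) for u = 1, 2, 3 (Example 7 data, as read).
q = {
    (0, 1): [0.15, 0.10, 0.00], (2, 1): [0.65, 0.10, 0.00],
    (0, 2): [0.10, 0.05, 0.01], (3, 2): [0.70, 0.10, 0.04],
    (0, 3): [0.15, 0.05, 0.00], (4, 3): [0.75, 0.05, 0.00],
    (0, 4): [0.05, 0.00, 0.00], (5, 4): [0.90, 0.05, 0.00],
    (0, 5): [1.00, 0.00, 0.00],
}


def q_total(j, i):                       # quantity (i)
    return sum(q[(j, i)])


def mean_given(j, i):                    # quantity (ii)
    return sum(u * p for u, p in enumerate(q[(j, i)], 1)) / q_total(j, i)


def var_given(j, i):                     # quantity (v)
    m = mean_given(j, i)
    return sum((u - m) ** 2 * p for u, p in enumerate(q[(j, i)], 1)) / q_total(j, i)


def mean_visit(i):                       # quantity (iii)
    return sum(u * p for (j, k), probs in q.items() if k == i
               for u, p in enumerate(probs, 1))


def h(i, u):                             # quantity (iv)
    return sum(p for (j, k), probs in q.items() if k == i
               for v, p in enumerate(probs, 1) if v > u)


print(mean_given(3, 2), mean_given(0, 2), mean_visit(2))
```

The two conditional means printed first correspond to the figures 1.21 and 1.44 quoted in the example, and summing h(i, u) over u gives the mean visit length, as Problem 5 asserts.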


The semi-Markov model can be viewed as a cross-sectional model with a
two-characteristic state space (the reader should verify that the converse is
not true). Suppose that a new state is defined to be a combination of an
original state i and the number of periods spent in that state, u. Then an
individual in state (i,u) moves next either to state (j,0), with probability
q_{ji}(u + 1)/h_i(u), or to state (i,u + 1) (remains in the same "original
state") with probability h_i(u + 1)/h_i(u).

Example 8: Continuation of example 7.

In this student example there are 10 states with a cross-sectional model
Q matrix given by

To \ From  (1,0)  (1,1)  (2,0)  (2,1)  (2,2)  (3,0)  (3,1)  (4,0)  (4,1)  (5,0)

(1,0)
(1,1)      0.20
(2,0)      0.65   0.50
(2,1)                    0.20
(2,2)                           0.25
(3,0)                    0.70   0.50   0.80
(3,1)                                         0.10
(4,0)                                         0.75   0.50
(4,1)                                                       0.05
(5,0)                                                       0.90   1.00
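
The expanded transition probabilities can be generated mechanically from the semi-Markov data. The sketch below applies the two rules stated before Example 8 to the Example 7 probabilities as read from the table (the labels of the first two data rows are an assumption); the dictionary layout and names are illustrative only.

```python
# q[(j, i)] lists q_ji(u) for u = 1, 2, 3 (Example 7 data, as read).
q = {
    (0, 1): [0.15, 0.10, 0.00], (2, 1): [0.65, 0.10, 0.00],
    (0, 2): [0.10, 0.05, 0.01], (3, 2): [0.70, 0.10, 0.04],
    (0, 3): [0.15, 0.05, 0.00], (4, 3): [0.75, 0.05, 0.00],
    (0, 4): [0.05, 0.00, 0.00], (5, 4): [0.90, 0.05, 0.00],
    (0, 5): [1.00, 0.00, 0.00],
}
MAX_U = 3
EPS = 1e-12


def h(i, u):
    """Probability that a visit to class i lasts more than u periods."""
    return sum(p for (j, k), probs in q.items() if k == i
               for v, p in enumerate(probs, 1) if v > u)


# transitions[(from_state, to_state)] = one-period probability.
transitions = {}
for i in range(1, 6):
    for u in range(MAX_U):
        if h(i, u) <= EPS:
            continue                                  # state (i, u) is never occupied
        if h(i, u + 1) > EPS:                         # remain in class i
            transitions[((i, u), (i, u + 1))] = h(i, u + 1) / h(i, u)
        for (j, k), probs in q.items():               # move to class j (not out)
            if k == i and j != 0 and probs[u] > 0:
                transitions[((i, u), (j, 0))] = probs[u] / h(i, u)

states = sorted({a for a, _ in transitions} | {b for _, b in transitions})
print(len(states), transitions[((1, 0), (2, 0))])
```

Departures to class 0 are omitted, as in the Q matrix of the example, so the probabilities leaving a state may sum to less than one.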


Problem 6: In terms of the GRADE/TIME-IN-GRADE model described in Section 3,
partition the matrix in example 8 to find the Q(i) and P(i) matrices, and
find the inverse matrix D = (I - Q)^{-1}. Interpret the result.

The semi-Markov model can also be viewed as a longitudinal model, but in
order to do this we must identify the chains. Chain k in the longitudinal
interpretation corresponds to state k in the semi-Markov formulation. An
individual is appointed in chain k if and only if he enters the system in
state k. Recall from III.5 that p_{ik}(u) is the probability that an individual
who enters on chain k in some period t will be in class i at time t + u.

By using conditional probability arguments, when k is different from i
we obtain from the semi-Markov assumptions,

    p_{ik}(u) = 0                                                   if u = 0 ,

              = \sum_{v=1}^{u} \sum_{j=1}^{N} p_{ij}(u - v) q_{jk}(v)   if u \ge 1 .

For the case i = k we have

    p_{ii}(u) = 1                                                   if u = 0 ,

              = h_i(u) + \sum_{v=1}^{u} \sum_{j=1}^{N} p_{ij}(u - v) q_{ji}(v) ,   if u \ge 1 .

Now let H(u) be an N x N matrix with off-diagonal elements equal to zero,
and i-th diagonal element equal to h_i(u).

Also let P(u) and Q(u) be N x N matrices with (j,i)-th elements equal
to p_{ji}(u) and q_{ji}(u) respectively. Then the above equations can be written
in the matrix form

    (3)    P(u) = H(u) + \sum_{v=0}^{u} P(u - v)Q(v),    u \ge 0 .

Since Q(u) contains the basic data of the semi-Markov model, and since H(u)
is calculated from this data using (iv), the longitudinal model matrices P(u)
are completely determined by solving (3).
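
Equation (3) can be solved forward in u, since P(u) depends only on earlier matrices once Q(0) = 0 is used. The sketch below does this for the student example, with the Example 7 probabilities as read from the table (the labels of the first two data rows are an assumption). Rows and columns 0 to 4 of each matrix stand for classes 1 to 5.

```python
q = {
    (0, 1): [0.15, 0.10, 0.00], (2, 1): [0.65, 0.10, 0.00],
    (0, 2): [0.10, 0.05, 0.01], (3, 2): [0.70, 0.10, 0.04],
    (0, 3): [0.15, 0.05, 0.00], (4, 3): [0.75, 0.05, 0.00],
    (0, 4): [0.05, 0.00, 0.00], (5, 4): [0.90, 0.05, 0.00],
    (0, 5): [1.00, 0.00, 0.00],
}
N, U = 5, 10          # classes 1..5, periods u = 0..9


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def h(i, u):
    return sum(p for (j, k), probs in q.items() if k == i
               for v, p in enumerate(probs, 1) if v > u)


def Q_mat(v):
    """N x N matrix with (j, i)-th element q_ji(v); class 0 is excluded."""
    M = [[0.0] * N for _ in range(N)]
    if 1 <= v <= 3:
        for (j, i), probs in q.items():
            if j >= 1:
                M[j - 1][i - 1] = probs[v - 1]
    return M


def H_mat(u):
    return [[h(i + 1, u) if i == k else 0.0 for k in range(N)] for i in range(N)]


P = [[[float(i == k) for k in range(N)] for i in range(N)]]     # P(0) = I
for u in range(1, U):
    acc = H_mat(u)
    for v in range(1, u + 1):
        term = matmul(P[u - v], Q_mat(v))
        acc = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(acc, term)]
    P.append(acc)

# Expected periods in each class for an entering freshman (chain 1).
periods = [sum(P[u][i][0] for u in range(U)) for i in range(N)]
print([round(x, 2) for x in periods])
```

Ten periods suffice here because no student can remain longer than 2 + 3 + 2 + 2 + 1 periods in total.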
Example 9: Continuation of example 8.

For the student example the values of p_{i1}(u) for i = 1,2,3,4,5, and
u = 0,1,2,...,9 are given by (to 2 significant figures)

[The table of values is missing from the source.]

Blank entries represent zeros or numbers less than .005.

Problem 7: Based on example 9 above.

a) Given that an individual enters as a freshman, what is the probability
of graduation?

b) Given that the entering freshman eventually graduates, what are the
mean and variance of the number of years spent as a student?

c) Given that the entering freshman drops out, what are the mean and variance
of the number of years spent as a student?

If all the basic data (the q_{ji}(u)'s) are known, equation (3) shows that
the longitudinal model matrices P(u) can be calculated and all the results
of Chapter III follow. Often the detailed transition probabilities are not
known, and only estimates of the means and variances \mu_{ji} and
\sigma_{ji}^2 can be obtained, together with the q_{ji}. Even with this limited
data it is often possible to obtain approximate results for the equilibrium
behavior of the system.

Recall that L = \sum_{u=0}^{\infty} P(u), and let H = \sum_{u=0}^{\infty} H(u),
and Q = \sum_{u=0}^{\infty} Q(u). The equations in (3) can be written out as

    P(0) = H(0)
    P(1) = H(1) + P(0)Q(1)
    P(2) = H(2) + P(1)Q(1) + P(0)Q(2)
    P(3) = H(3) + P(2)Q(1) + P(1)Q(2) + P(0)Q(3)
    etc.

Summing these equations and using the above definitions we get

    L = H + LQ

or

    L = H(I - Q)^{-1} .

Now H is the sum of diagonal matrices and is itself a diagonal matrix with
(i,i)-th element equal to \mu_i (see problem 5). Let D = (I - Q)^{-1}. Then
d_{ik} is the expected number of visits to state i given that the system was
entered in state k. Thus

    \ell_{ik} = \mu_i d_{ik} ,

where \ell_{ik} is the expected number of periods spent in class i, given the
system was entered on chain k. If a stationary vector g gives the chain inflows
in each period the steady state stocks will be

    s = Lg = HDg .

Example 10: Continuation of example 9.

(Blank entries are zero.)

        1.00
        0.75  1.00
    D = 0.63  0.84  1.00
        0.50  0.67  0.80  1.00
        0.48  0.64  0.76  0.95  1.00

        1.20
              1.25
    H =             1.10
                          1.05
                                1.00

        1.20
        0.94  1.25
    L = 0.69  0.92  1.10
        0.53  0.71  0.84  1.05
        0.48  0.64  0.76  0.95  1.00
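
The matrices of this example can be reproduced from the summary quantities alone. The sketch below takes the promotion probabilities q_{i+1,i} and mean visit lengths \mu_i from the student example as given, forms D by the terminating series I + Q + Q^2 + ... (valid here because Q has entries only below the diagonal), and scales the rows by \mu_i. The inflow vector g is a hypothetical value added for illustration.

```python
N = 5
q_next = [0.75, 0.84, 0.80, 0.95]           # q_21, q_32, q_43, q_54
mu = [1.20, 1.25, 1.10, 1.05, 1.00]         # mean visit lengths


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


Q = [[0.0] * N for _ in range(N)]
for i, prob in enumerate(q_next):
    Q[i + 1][i] = prob

# D = (I - Q)^{-1} via the series; Q^N = 0 for this matrix.
D = [[float(r == c) for c in range(N)] for r in range(N)]
term = [row[:] for row in D]
for _ in range(N):
    term = matmul(term, Q)
    D = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(D, term)]

# L = HD with H diagonal, so row i of D is scaled by mu_i.
L = [[mu[i] * D[i][k] for k in range(N)] for i in range(N)]

g = [100.0, 0.0, 0.0, 0.0, 0.0]             # assumed: 100 freshmen enter each period
s = [sum(L[i][k] * g[k] for k in range(N)) for i in range(N)]
print([round(x, 1) for x in s])
```

With this inflow the steady state stocks are simply 100 times the first column of L.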


Problem 8: Based on example 10.

Assume that you enter this student group as a junior.

a) How many periods do you expect to attend?

b) What is the probability that you will graduate?

Problem 9: Show that, given you enter class k, the probability of ever reaching
class i is d_{ik}/d_{kk}.

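To continue with the steady state approximations consider next the case where
input flows are growing geometrically at rate (\theta - 1). Thus
g(t) = \theta^t g and from equation (7) of Chapter III the stocks in period t
(t large) are given by

    s(t) = \theta^t L(\theta)g ,

where

    L(\theta) = \sum_{u=0}^{\infty} \theta^{-u} P(u) .

Define

    P(\delta) = \sum_{u=0}^{\infty} \delta^u P(u) ,

    H(\delta) = \sum_{u=0}^{\infty} \delta^u H(u) ,

    Q(\delta) = \sum_{u=0}^{\infty} \delta^u Q(u) .

By multiplying the u-th matrix equation in (3) by \delta^u and summing over u
we get

    P(\delta) = H(\delta) + P(\delta)Q(\delta) .

Hence, with \delta = 1/\theta,

    P(\delta) = L(\theta) = H(\delta)(I - Q(\delta))^{-1} .

For \delta close to 1 the basic approximation formulas (Appendix 1) can be used
for the elements of H(\delta) and Q(\delta). From these

    q_{ji}(\delta) \approx q_{ji} \delta^{\mu_{ji}} (1 + (1/2)\alpha^2 \sigma_{ji}^2) ,

    h_i(\delta) \approx [1 - \delta^{\mu_i}(1 + (1/2)\alpha^2 \sigma_i^2)]/(1 - \delta) ,

where

    \alpha = \log_e \theta ,   \theta = 1/\delta .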


Example 11: Continuation of example 10.

Let \theta = 1.03, so that \delta = 0.97. Then (blank entries are zero)

                1.00
                0.72  1.00
    D(\delta) = 0.59  0.81  1.00
                0.45  0.63  0.77  1.00
                0.42  0.58  0.71  0.92  1.00

and

                1.19
                      1.24
    H(\delta) =             1.10
                                  1.05
                                        1.00

                1.19
                0.90  1.24
    P(\delta) = 0.64  0.88  1.10
                0.48  0.66  0.81  1.05
                0.42  0.58  0.71  0.92  1.00

The actual values of P(\delta) are very close to these approximations.
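
The closeness of the approximation can be illustrated for class 1 of the student example. The sketch below compares the exact discounted sums with the two-moment approximations for q_21(\delta) and h_1(\delta), taking \delta = 0.97 as in the example; the class 1 probabilities are those read from the Example 7 table, and their row labels are an assumption. The helper moments is illustrative only.

```python
import math

delta = 0.97
theta = 1 / delta
alpha = math.log(theta)

q01 = [0.15, 0.10]          # q_01(u) for u = 1, 2 (as read)
q21 = [0.65, 0.10]          # q_21(u) for u = 1, 2 (as read)


def moments(probs):
    """Total mass, mean and variance of the visit length for the given terms."""
    total = sum(probs)
    mean = sum(u * p for u, p in enumerate(probs, 1)) / total
    var = sum((u - mean) ** 2 * p for u, p in enumerate(probs, 1)) / total
    return total, mean, var


# Promotion from class 1 to class 2.
q_tot, mu21, var21 = moments(q21)
exact_q = sum(delta ** u * p for u, p in enumerate(q21, 1))
approx_q = q_tot * delta ** mu21 * (1 + 0.5 * alpha ** 2 * var21)

# Visit length to class 1 regardless of destination.
visit = [a + b for a, b in zip(q01, q21)]
_, mu1, var1 = moments(visit)
h1 = [1.0, visit[1]]        # h_1(0), h_1(1); h_1(u) = 0 afterwards
exact_h = sum(delta ** u * x for u, x in enumerate(h1))
approx_h = (1 - delta ** mu1 * (1 + 0.5 * alpha ** 2 * var1)) / (1 - delta)

print(exact_q, approx_q)
print(exact_h, approx_h)
```

Only the totals, means and variances enter the approximate values, which is what makes the method usable when the detailed q_{ji}(u) are unavailable.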
5. A Theoretical Comparison.

The stochastic interpretations of the longitudinal and cross-sectional
models developed in III.5 and II.1 are used in this section in a theoretical
comparison of the two models. Some data on student enrollment are used to
illustrate the results.

Throughout this section we assume the longitudinal model is a valid
description of the system's law of motion. Our intention is to construct a good
cross-sectional approximation to that model and then examine the quality of the
approximation. The actual approximation is time dependent and also depends on
past inflows. Moreover, it depends on data that is usually not available in
a longitudinal model. Nevertheless, the approximation does help us to describe
the rational limits of approximating a longitudinal model with a cross-sectional
model.

Recall that S(t) is an N-dimensional random vector, where S_i(t) is a random
variable which gives the stocks in class i at time t. The expected stocks
in each class are given by the elements of s(t) = E[S(t)]. For a (possibly
nonstationary) cross-sectional model the conditional expected value of S(t + 1)
given both the realized values of stocks S(t) at time t and the (expected)
inflows f(t + 1) in period t + 1, is easily derived from equation (2) in
II.2. Let the superscript c represent the "cross-sectional model." Then

    (4)    E^c[S(t + 1) | S(t) = x] = Q(t)x + f(t + 1) .

Note that we use Q(t) to indicate that the transition matrix can be
non-stationary from period to period.

The basic longitudinal model gives the unconditional expected values of
S(t + 1). From equation (4) in III.2,

    (5)    s(t + 1) = E[S(t + 1)] = \sum_{u=0}^{\infty} P(u)g(t + 1 - u) .


To compare the longitudinal and cross-sectional models we must derive an
expression for the conditional expectation E^{\ell}[S(t + 1) | S(t) = x], where
the superscript \ell denotes "longitudinal model." In order to determine this
expression some assumptions must be made on individual behavior and some
results of probability theory exploited.

The longitudinal model stipulates that each individual in the system is subject
to a stochastic law of motion that depends only on the individual's chain and
elapsed time in the system. In particular, the movement of any given individual
is independent of the movement of others.

With each individual who enters the system we associate a counting random
variable. Let

    Z_{ik}^{(j)}(t - u,t) = 1 if individual j, who entered in chain k in
                              period t - u, is in class i at time t,

                          = 0 otherwise.

Recall that g_k(u) is the total number who enter in chain k in period u. Then
the stock in class i at time t is the random variable

    (6)    S_i(t) = \sum_{k=1}^{K} \sum_{u=0}^{\infty} \sum_{j=1}^{g_k(t-u)} Z_{ik}^{(j)}(t - u,t) .


The central limit theorem of probability theory states that under our assumptions
S_i(t) has approximately a normal distribution. Also the elements of the
N-vector S(t) are jointly normally distributed, and the elements of the 2N-vector
(S(t), S(t + 1)) are also jointly normally distributed.

Now let b_{ij} = Cov[S_i(t), S_j(t)], where Cov indicates covariance. Also
let c_{ij} = Cov[S_j(t), S_i(t+1)]. The matrices B and C, with (i,j)-th elements
equal to b_{ij} and c_{ij} respectively, are N x N covariance matrices. From the theory of
multivariate normal distributions we can now write down the expression for the
conditional expectation, namely,

(7)    E^{\ell}[S(t+1) \mid S(t) = x] = C(t)B^{-1}(t)x + P(0)g(t+1) + [s(t+1) - (C(t)B^{-1}(t)s(t) + P(0)g(t+1))].

This complicated expression reduces to

       E^{\ell}[S(t+1) \mid S(t) = x] = s(t+1) + C(t)B^{-1}(t)[x - s(t)],

so that when x = s(t), the forecast reduces to s(t + 1).

Before we can compare the forecasts obtained in (4) and (7) it is necessary
to analyze the covariance matrices B(t) and C(t). First consider B(t). Using
the expression in (6) with the definition of covariance one can show that

    b_{ii}(t) = s_i(t) - \sum_{u=0}^{\infty} \sum_{k=1}^{K} p_{ik}(u)^2 g_k(t-u),

    b_{ij}(t) = - \sum_{u=0}^{\infty} \sum_{k=1}^{K} p_{jk}(u) p_{ik}(u) g_k(t-u),   for i \ne j.

Now let M(t) be an N x N matrix with off-diagonal elements equal to 0 and
m_{ii}(t) = s_i(t). Let G(t-u) be a similar K x K matrix but with g_{kk}(t-u) =
g_k(t-u). Then the matrix B(t) can be written as

(8)    B(t) = M(t) - \sum_{u=0}^{\infty} P(u) G(t-u) P'(u).

Recall that the prime indicates matrix transposition.
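The construction of B(t) in (8) can be sketched numerically. The following is a minimal illustration, assuming constant cohort sizes and a hypothetical two-class, one-chain system; the function name `stock_moments` and all of the numbers are invented for the example and do not come from the text.

```python
def stock_moments(P, g):
    """Return (s, B) for constant cohort sizes g.

    P[u][i][k] is p_ik(u), the probability that an entrant in chain k
    is in class i after u periods; g[k] is the cohort size of chain k.
    """
    n_classes = len(P[0])
    n_chains = len(g)
    # Expected stocks: s_i = sum_u sum_k p_ik(u) g_k
    s = [sum(P_u[i][k] * g[k] for P_u in P for k in range(n_chains))
         for i in range(n_classes)]
    # B = M - sum_u P(u) G P'(u), with M = diag(s) and G = diag(g)
    B = [[(s[i] if i == j else 0.0)
          - sum(P_u[i][k] * g[k] * P_u[j][k]
                for P_u in P for k in range(n_chains))
          for j in range(n_classes)]
         for i in range(n_classes)]
    return s, B


# Hypothetical two-class, one-chain system observed for three periods.
P = [
    [[1.0], [0.0]],   # u = 0: every entrant starts in class 1
    [[0.5], [0.3]],   # u = 1
    [[0.0], [0.4]],   # u = 2
]
g = [100.0]
s, B = stock_moments(P, g)
print(s)
print(B)
```

In this toy case the diagonal of B is smaller than the corresponding expected stock and the off-diagonal entry is negative, which is the pattern the formulas for b_{ii}(t) and b_{ij}(t) imply.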

We now turn to analyzing the matrix C(t). Since c_{ji}(t) is a covariance
term between stocks in class i at time t and stocks in class j at t + 1,
it is necessary to know the joint distribution of the class of an individual at
both t and t + 1.

Define

    f^k_{ij}(u) = Prob[in class i at t and in class j at t + 1 | entered chain k in period t + 1 - u].

Later in this section these joint probabilities are discussed in detail and related
to results in III.10. Continuing with our analysis of C(t), it follows from
this definition of f^k_{ij}(u) that, if f_{ij}(t+1) is the expected flow from
class i to class j in period t + 1,

    f_{ij}(t+1) = \sum_{u=0}^{\infty} \sum_{k=1}^{K} f^k_{ij}(u+1) g_k(t-u).


Using (6) and the definition of covariance it can be shown that

(9)    C(t) = F'(t+1) - \sum_{u=0}^{\infty} P(u+1) G(t-u) P'(u),

where F(t+1) is the N x N matrix of expected flows [f_{ij}(t+1)]. Next, recall
that q_{ji}(t) is the fraction of those individuals in class i at time t who
move to class j at t + 1. Thus

    q_{ji}(t) = f_{ij}(t+1) / s_i(t),

or in matrix form,

(10)   Q(t) = F'(t+1) M^{-1}(t).


Now clearly the stocks in class j at time t + 1 are given by the sum
of all flows into class j in period t + 1. Thus

    s_j(t+1) = \sum_{i=0}^{N} f_{ij}(t+1).

Using (10) and substituting for the input chain flows,

    s_j(t+1) = \sum_{i=1}^{N} q_{ji}(t) s_i(t) + \sum_{k=1}^{K} p_{jk}(0) g_k(t+1).

In matrix form this becomes

(11)   s(t+1) = Q(t)s(t) + P(0)g(t+1).


Equation (11) could have been obtained from (4) directly, but by fallacious
reasoning. Recall that our assumption is that the longitudinal model truly
describes movement through the system, whereas (4) is simply a cross-sectional
representation which approximates the true model.


By subtracting (7) from (4) and substituting (11) one finds that

(12)   E^{c}[S(t+1) \mid S(t) = x] - E^{\ell}[S(t+1) \mid S(t) = x] = [C(t)B^{-1}(t) - Q(t)] (s(t) - x).

Equation (12) gives the one-period forecasting error caused by using the
cross-sectional model in place of the longitudinal model. By taking expectations
on S(t) we see that "on the average" the expected error is zero in every
class.
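The subtraction that leads to (12) can be written out step by step. The sketch below uses the reduced form of (7) and assumes that the inflow term in (4) equals P(0)g(t+1), which is the identification needed for (11) to apply; it is offered as a reading aid rather than as the authors' own derivation.

```latex
\begin{align*}
E^{c}[S(t+1)\mid S(t)=x] - E^{\ell}[S(t+1)\mid S(t)=x]
  &= Q(t)x + P(0)g(t+1) - s(t+1) - C(t)B^{-1}(t)\,[x - s(t)] \\
  &= Q(t)x + P(0)g(t+1) - Q(t)s(t) - P(0)g(t+1) - C(t)B^{-1}(t)\,[x - s(t)] \\
  &= -Q(t)\,[s(t) - x] + C(t)B^{-1}(t)\,[s(t) - x] \\
  &= [C(t)B^{-1}(t) - Q(t)]\,(s(t) - x).
\end{align*}
```

The second line substitutes (11) for s(t+1); the inflow terms cancel, leaving only the deviation of the observed stocks x from their expectation s(t).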


In order to say more about the size of the discrepancy between the two models
it is necessary to know something about the magnitude of the entries in the matrix
[C(t)B^{-1}(t) - Q(t)]. Let

    D(t) = \sum_{u=0}^{\infty} P(u+1) G(t-u) P'(u),

    H(t) = \sum_{u=0}^{\infty} P(u) G(t-u) P'(u).

Then from (8) and (9) we have

    B(t) = M(t) - H(t),

    C(t) = F'(t+1) - D(t).

From these equations together with (10) it can be shown that

(13)   C(t)B^{-1}(t) - Q(t) = [Q(t)H(t) - D(t)] B^{-1}(t).


Problem 10:
a) Verify equation (13).

b) Show that if P(u + 1) = Q(t)P(u) for all u >= 0, then C(t)B^{-1}(t) -
Q(t) = 0 and the two models coincide.

To investigate (13) further we consider the one-class, one-chain model with
constant input. In this case all matrices and vectors reduce to scalars,
g(t) = g for all t, and P(u) = p(u). Moreover

    H = g \sum_{u=0}^{\infty} p(u)^2,      s = M = g \sum_{u=0}^{\infty} p(u),

    F = g \sum_{u=1}^{\infty} p(u)    and    D = g \sum_{u=0}^{\infty} p(u)p(u+1).

Let \lambda = \sum_{u=0}^{\infty} p(u), the expected lifetime of an individual in the system. Then

(14)   QH - D = \frac{g}{\lambda} \Big[ \sum_{u \ge 0} p(u)^2 \sum_{u \ge 0} p(u+1) - \sum_{u \ge 0} p(u+1)p(u) \sum_{u \ge 0} p(u) \Big].

The term in brackets in (14) is

    \sum_{u \ge 0} p(u)^2 (\lambda - 1) - \lambda \sum_{u \ge 0} p(u)p(u+1) = \lambda \sum_{u \ge 0} \Delta(u+1)p(u) - \sum_{u \ge 0} p(u)^2,

where \Delta(u+1) = p(u) - p(u+1).

Interpreting p(u) as the tail distribution of a non-negative random variable,
say \Lambda for "lifetime," one can show that

(15)   \sum_{u \ge 0} p(u)[1 - p(u)] = \sum_{u \ge 0} \Delta(u) \sum_{v \ge u} p(v),

and

(16)   \sum_{u \ge 0} [\Delta(u) + \Delta(u+1)] p(u) = 1.

Using (15) and (16) in (14) gives

(17)   QH - D = \frac{g}{\lambda} \sum_{u \ge 0} \Delta(u) \Big[ \sum_{v \ge u} p(v) - \lambda p(u) \Big].


Let us assume now that the expected remaining lifetime of a person whose
time in the system exceeds u time periods is no more than the expected lifetime
\lambda of a new input. We say that people have "mean residual life" bounded above
by the original mean life, and say that \Lambda has MRLA if

    \frac{1}{p(u)} \sum_{v \ge u} p(v) \le \lambda,    u = 0, 1, 2, ..., for which p(u) \ne 0.

Note that equality holds in this equation for the geometric distribution. Table
IV.1 shows that in a particular case of students attending the University of
California at Berkeley (see Table II.15 also) this assumption is valid.

Under the MRLA assumption, from (17) we see that

    QH - D \le 0.

In the stationary case [QH - D]B^{-1}[s - x] is independent of t.
Since B is nonnegative, we have the following conclusions:

If we assume \Lambda has MRLA,

a)  If x < s, the cross-sectional model under-estimates the value of
    E^{\ell}[S(t+1) \mid S(t) = x].

b)  If x > s, the cross-sectional model over-estimates the value of
    E^{\ell}[S(t+1) \mid S(t) = x].
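The MRLA condition and conclusion (a) can be checked numerically in the one-class, one-chain case. The sketch below transcribes p(u) from Table IV.1; the cohort size g = 1000, the 100-student shortfall, and the helper name `qh_minus_d` are assumptions chosen only for illustration.

```python
# Survival probabilities p(u) = Pr[lifetime > u], transcribed from Table IV.1.
p = [1.000, 0.972, 0.905, 0.756, 0.684, 0.593, 0.562, 0.524, 0.498,
     0.199, 0.130, 0.050, 0.036, 0.017, 0.015, 0.011, 0.007]

lam = sum(p)  # expected lifetime, in semesters

# Mean residual life at u: sum_{v >= u} p(v) / p(u)
mrl = [sum(p[u:]) / p[u] for u in range(len(p))]
has_mrla = all(m <= lam + 1e-9 for m in mrl)


def qh_minus_d(p, g):
    """Scalar QH - D for one class, one chain, constant cohort size g.

    Assumes p[0] == 1, so that Q = F / M = (lambda - 1) / lambda.
    """
    lam = sum(p)
    q = (lam - 1.0) / lam
    h = g * sum(x * x for x in p)                 # H = g * sum p(u)^2
    d = g * sum(a * b for a, b in zip(p, p[1:]))  # D = g * sum p(u) p(u+1)
    return q * h - d


g = 1000.0                                  # assumed constant cohort size
gap = qh_minus_d(p, g)
b = g * sum(x * (1.0 - x) for x in p)       # B = M - H
s = g * lam
x = s - 100.0                               # observed stock 100 below its mean
error = gap / b * (s - x)                   # scalar form of equation (12)

print(has_mrla, gap, error)
```

With these inputs the MRLA check passes, QH - D is negative, and the cross-sectional forecast falls short of the longitudinal one by roughly eight students, in line with conclusion (a).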

Since S(t) has a marginal normal distribution we can say more about the
expected error in the one-dimensional case. The error is a normal random variable
with zero mean, and variance equal to (QH - D)^2 B^{-1} (where these are all scalars).
Thus we can say that with probability about .95 the error will lie in the interval
(-2B^{-1/2}|QH - D|, +2B^{-1/2}|QH - D|). The length of this interval increases as the
square root of g. However, s, the expected value of S(t), increases as g.
Thus the interval length divided by s, or the fractional error range, decreases
in proportion to the inverse of the square root of g. So as g increases, and hence
s increases, the width of the confidence interval of error increases much more slowly.
To illustrate this we use the lifetime distribution from Table IV.1, and for various
cohort sizes we show how the interval length changes. The results are given in Table
IV.2. It is clear from this table that even though the lifetime distribution
differs considerably from a Markovian (geometric) distribution with the same
mean, the confidence intervals on the forecasting error are extremely small
relative to the expected number in the system. For comparison p(u) is drawn
in Figure IV.1 together with a geometric distribution.

    Lifetime          Pr[\Lambda > u]    \sum_{v \ge u} p(v)    \sum_{v \ge u} p(v) / p(u)
    (semesters) u     = p(u)

         0            1.000              6.959                  6.96
         1            0.972              5.959                  6.14
         2            0.905              4.987                  5.52
         3            0.756              4.082                  5.42
         4            0.684              3.326                  4.86
         5            0.593              2.642                  4.47
         6            0.562              2.049                  3.65
         7            0.524              1.487                  2.84
         8            0.498               .936                  1.88
         9            0.199               .465                  2.34
        10            0.130               .266                  2.05
        11            0.050               .136                  2.72
        12            0.036               .086                  2.39
        13            0.017               .050                  2.94
        14            0.015               .033                  2.20
        15            0.011               .018                  1.64
        16            0.007               .007                  1.00

Table IV.1. Mean Residual Life of Freshman Students Entering
U.C. Berkeley in Fall Semester, 1955.


[Figure IV.1. Comparison of p(u) for UCB Students with a Geometric Distribution.
Curve label: 1955 UCB students (2126); expected lifetime \lambda = 3.5 years.
Horizontal axis: semesters (u), 1 to 16.]

    Cohort Size       E[S] = s       Confidence Interval
        g                            for Forecast Error

       1000             6,959            (-7, 7)
       2000            13,918            (-10, 10)
       3000            20,877            (-12, 12)
       4000            27,836            (-14, 14)

Table IV.2. 95% Confidence Intervals for Various Cohort Sizes.
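The entries of Table IV.2 can be recomputed from the p(u) column of Table IV.1 using the scalar expressions for H, D and B and the two-standard-deviation interval 2|QH - D| / sqrt(B). The sketch below is an illustrative recomputation, not the authors' original program; the function name `interval_half_width` is invented here.

```python
import math

# Survival probabilities p(u) transcribed from Table IV.1.
p = [1.000, 0.972, 0.905, 0.756, 0.684, 0.593, 0.562, 0.524, 0.498,
     0.199, 0.130, 0.050, 0.036, 0.017, 0.015, 0.011, 0.007]


def interval_half_width(p, g):
    """Half-width 2 * |QH - D| / sqrt(B) of the approximate 95% interval."""
    lam = sum(p)
    q = (lam - 1.0) / lam                         # Q = F / M, assuming p[0] == 1
    h = g * sum(x * x for x in p)                 # H = g * sum p(u)^2
    d = g * sum(a * b for a, b in zip(p, p[1:]))  # D = g * sum p(u) p(u+1)
    b = g * lam - h                               # B = M - H
    return 2.0 * abs(q * h - d) / math.sqrt(b)


rows = [(g, round(g * sum(p)), round(interval_half_width(p, g)))
        for g in (1000, 2000, 3000, 4000)]
for g, s, w in rows:
    print(f"{g:>6} {s:>8}  (-{w}, {w})")
```

Doubling the cohort size doubles the expected stock but widens the interval only by a factor of about 1.4, which is the square-root behaviour described in the text.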


Determination of properties of the matrix in (13) for the multi-class, multi-chain
case is much more difficult than in the one-class, one-chain case. A 4-class,
4-chain numerical example follows; it uses the student enrollment data from
Table III.3 and assumes constant cohort-size input.

The forecasting error given by (12) has a multivariate normal distribution
with mean 0 and covariance matrix (QH - D)(B^{-1})'(QH - D)'. Using the data given
in Table III.3 for freshmen, sophomores, juniors and seniors at the University
of California, Berkeley, 1955-1969, calculations were made assuming constant
cohort sizes of 3000 freshmen, 700 sophomores, 1300 juniors and 150 seniors entering
each fall semester. These figures are approximately what the Berkeley campus had
been experiencing in its fall new admissions.


Table IV.3 gives the matrix B, whose (j,i)-th element is the covariance of
S_i(t) and S_j(t) for some t. Also included is s, the vector of expected
stocks in each class.


    Class j \ Class i    Freshmen    Sophomores    Juniors    Seniors

    Freshmen                673         -454         -30        -10
    Sophomores             -454         1453        -380        -43
    Juniors                 -30         -380        2137       -535
    Seniors                 -10          -43        -535       2216

    Expected Values        3868         3324        4687       3227

Table IV.3. Covariance Matrix B for the 4-class example.


The variance of the number in each class increases with class level, from
freshmen to seniors, and every pair of classes is negatively correlated.
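Both observations can be read off Table IV.3 directly. As a small illustration, the sketch below converts the covariance entries into correlation coefficients; the variable names are chosen for the example.

```python
import math

classes = ["Freshmen", "Sophomores", "Juniors", "Seniors"]

# Covariance matrix B transcribed from Table IV.3.
B = [
    [673, -454, -30, -10],
    [-454, 1453, -380, -43],
    [-30, -380, 2137, -535],
    [-10, -43, -535, 2216],
]

n = len(B)
variances = [B[i][i] for i in range(n)]
# Correlation: cov(i, j) / (sd_i * sd_j)
corr = [[B[i][j] / math.sqrt(B[i][i] * B[j][j]) for j in range(n)]
        for i in range(n)]

for name, row in zip(classes, corr):
    print(f"{name:<11}", "  ".join(f"{c:6.3f}" for c in row))
```

The strongest correlations are between adjacent classes (about -0.46 between freshmen and sophomores), while classes two or more levels apart are nearly uncorrelated.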


Table IV.4 gives the matrix (QH - D)B^{-1}(QH - D)', which is the covariance
matrix of the forecasting error. It can be seen that these numbers are very
small compared to the size of the predicted values, as was found in the single
state case.


    Class j \ Class i    Freshmen    Sophomores    Juniors    Seniors

    Freshmen                6.7          2.2        -22.4       -5.4
    Sophomores              2.2          1.0         -8.5       -2.7
    Juniors               -22.4         -8.5         82.2       29.5
    Seniors                -5.4         -2.7         29.5       41.8

Table IV.4. Covariance Matrix of Forecasting Error.
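To make "very small" concrete, one can compare the standard deviation of the forecasting error in each class (the square root of the diagonal of Table IV.4) with the expected stock from Table IV.3. The sketch below does only that arithmetic; the dictionary names are illustrative.

```python
import math

# Expected stocks from Table IV.3 and error variances (diagonal of Table IV.4).
expected = {"Freshmen": 3868, "Sophomores": 3324, "Juniors": 4687, "Seniors": 3227}
error_var = {"Freshmen": 6.7, "Sophomores": 1.0, "Juniors": 82.2, "Seniors": 41.8}

# Standard deviation of the one-period forecasting error relative to the stock.
relative_sd = {c: math.sqrt(error_var[c]) / expected[c] for c in expected}

for c, r in relative_sd.items():
    print(f"{c:<11} {100 * r:.3f}% of expected stock")
```

In every class the standard deviation of the error is on the order of 0.2 percent of the expected stock or less.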


The matrix (QH - D)B^{-1} is given in Table IV.5.

    Class j \ Class i    Freshmen    Sophomores    Juniors    Seniors

    Freshmen               .068         .013         .002       .001
    Sophomores            -.041        -.003         .003       .001
    Juniors                .290        -.062        -.030       .029
    Seniors                .040        -.046        -.125       .032

Table IV.5. (QH - D)B^{-1} for the 4-Class Example.

This is an example where (QH - D) is neither >= 0 nor <= 0, unlike the
one-class, one-chain model.

Even though movement through the system is far from that represented by a
stationary cross-sectional model (i.e., P(u) cannot be written as Q^u P(0) for any
matrix Q), when constant cohort sizes are used the cross-sectional model gives
essentially the same prediction as the more complex longitudinal model. However,
the longitudinal model is primarily formulated for forecasting under conditions of
controlled input. This is often the situation when policy changes are implemented,
and under such conditions the sizes of cohorts in successive time periods can and do
vary considerably. For example, the freshmen cohorts in the fall quarters at Berkeley
in the period 1966-1969 are shown in Table IV.6. This was a period when total campus
enrollment was controlled, and new students entered only to fill available room.


    Date           Cohort Size

    Fall 1966         3,053
    Fall 1967         3,303
    Fall 1968         2,239
    Fall 1969         1,883

Table IV.6. Freshmen Cohort Sizes at U.C. Berkeley.


One can see that, since F(t) and s(t) are both functions of previous
cohort sizes (up to period t), the cross-sectional transition probabilities
will change with time, and that estimating them from cross-sectional data in two
consecutive years will not account for gross changes in cohort sizes.
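The dependence of the cross-sectional transition fraction on cohort history can be illustrated in a one-class setting, where (10) reduces to q(t) = f(t+1) / s(t). The sketch below uses the cohort sizes of Table IV.6 together with a hypothetical four-period survival function; the survival values and the function name `cross_sectional_q` are assumptions made for the example, not data from the text.

```python
def cross_sectional_q(p, cohorts):
    """One-class transition fraction q(t) = f(t+1) / s(t).

    p[u] is the probability of still being in the system u periods after
    entry; cohorts lists cohort sizes oldest first, ending with the cohort
    that enters in period t.
    """
    recent_first = cohorts[::-1]                                  # g(t), g(t-1), ...
    stock = sum(pu * g for pu, g in zip(p, recent_first))         # s(t)
    stay = sum(pu1 * g for pu1, g in zip(p[1:], recent_first))    # f(t+1)
    return stay / stock


p = [1.0, 0.8, 0.5, 0.2]               # hypothetical survival probabilities
steady = [3000, 3000, 3000, 3000]      # constant cohort sizes
berkeley = [3053, 3303, 2239, 1883]    # Table IV.6, Fall 1966 to Fall 1969

q_steady = cross_sectional_q(p, steady)
q_shrinking = cross_sectional_q(p, berkeley)
print(q_steady, q_shrinking)
```

Under these assumptions the same individual behaviour p(u) yields a transition fraction of 0.60 with constant cohorts and about 0.55 with the shrinking cohorts, so a value estimated in one regime would misstate flows in the other.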


We end this section with a brief discussion of the joint probabilities f^k_{ij}(u)
and their connection with the flow parameters of III.10 (longitudinal
conservation).

First it is easy to see that if u = 0, then f^k_{0j}(0) = p_{jk}(0), and for i \ne 0,
f^k_{ij}(0) = 0. These relations follow directly from the definitions. Next, since any
individual who leaves the system cannot return, for j \ne 0 and u >= 1, f^k_{0j}(u) = 0.
Also, by looking at the flows into some state j in period u it follows that

    p_{jk}(u) = \sum_{i=0}^{N} f^k_{ij}(u).

Similarly, by looking at flows out of some state i in period (u + 1),

    p_{ik}(u) = \sum_{j=0}^{N} f^k_{ij}(u+1).

Thus the {p_{ik}(u)} are the marginals of the joint probabilities {f^k_{ij}(u)}. In
many applications the {f^k_{ij}(u)} are hard to measure and it would be advantageous
if they could be estimated from the {p_{ik}(u)}, which are relatively easy to measure.
In general the marginals do not determine the joint distributions.


Problem 11: The longitudinal model would have serial independence if f^k_{ij}(u + 1) =
p_{ik}(u)p_{jk}(u + 1). Since people who leave cannot return, f^k_{0j}(u) = 0 for all
u >= 1 and j \ne 0. Use this to prove that we cannot have serial independence in the
longitudinal model.


6. Notes and Comments.

The material in section 3 is based on Hayne [1974] and Hayne and Marshall [1974].
This type of model makes it possible to work with a highly disaggregated manpower
classification scheme and still have some control over the interpretation and
manipulation of the model.

The semi-Markov model of section 4 is new. The reader may consult Ross [1970],
and references cited there, for a description of semi-Markov models. Austin [1971]
and Bartholomew [1973] discuss semi-Markov models. The treatment in section 4 is
quite different. We stress approximations that can be obtained from the transition
probabilities, and the first two moments of the length of a visit.

Section 5 is based on Marshall [1973]. It reveals the underlying structure
of the longitudinal models and reinforces the theoretical notions derived in section
10 of chapter III.

